feat: auto-compact and retry on context window errors #4
TheArchitectit wants to merge 109 commits into
Code Review
This pull request introduces an automatic retry mechanism when the model API returns a context window error by compacting the session and resubmitting the request. While the feature is useful, the current implementation has a critical logic error where the retry still uses the original uncompacted session, rendering the compaction ineffective. Feedback also highlights issues with the lifecycle management of the abort monitor, potential UI corruption from reusing a finished spinner, and the need to ensure retries only occur if messages were actually removed during compaction.
let (mut new_runtime, hook_abort_monitor) = self.prepare_turn_runtime(true)?;
drop(hook_abort_monitor); // not needed for retry

// Step 3: Run the turn again with the smaller session
let mut rp = CliPermissionPrompter::new(self.permission_mode);
match new_runtime.run_turn(input, Some(&mut rp)) {
    Ok(summary) => {
        // Success! Replace old runtime with the new compacted one
        self.replace_runtime(new_runtime)?;
        spinner.finish(
            "✨ Done (after auto-compact)",
            TerminalRenderer::new().color_theme(),
            &mut stdout,
        )?;
This block contains a critical logic error and UI issues:
- Logic Error: `prepare_turn_runtime` is hardcoded to use the session from `self.runtime`, which is still the original uncompacted session. Consequently, the retry will be performed with the same oversized context that caused the initial error, leading to a repeated failure. You must construct the runtime using `result.compacted_session`.
- Monitor Lifecycle: `hook_abort_monitor` should not be dropped immediately; it must remain active during the retry to handle Ctrl-C interrupts for tool execution (hooks), and then be stopped explicitly after the turn completes.
- UI State: Calling `spinner.finish` on the original spinner object is problematic because `spinner.fail` was already called at line 4288. Reusing a spinner that has reached a terminal state can corrupt terminal output. It is better to use a fresh spinner for the retry attempt.
// Step 2: Build a new runtime with the compacted session and retry
let hook_abort_signal = runtime::HookAbortSignal::new();
let mut new_runtime = build_runtime(
    result.compacted_session,
    &self.session.id,
    self.model.clone(),
    self.system_prompt.clone(),
    true,
    true,
    self.allowed_tools.clone(),
    self.permission_mode,
    None,
)?
.with_hook_abort_signal(hook_abort_signal.clone());
let hook_abort_monitor = HookAbortMonitor::spawn(hook_abort_signal);
// Step 3: Run the turn again with the smaller session
let mut rp = CliPermissionPrompter::new(self.permission_mode);
let mut retry_spinner = Spinner::new();
retry_spinner.tick("🦀 Retrying...", TerminalRenderer::new().color_theme(), &mut stdout)?;
let retry_result = new_runtime.run_turn(input, Some(&mut rp));
hook_abort_monitor.stop();
match retry_result {
    Ok(summary) => {
        // Success! Replace old runtime with the new compacted one
        self.replace_runtime(new_runtime)?;
        retry_spinner.finish(
            "✨ Done (after auto-compact)",
            TerminalRenderer::new().color_theme(),
            &mut stdout,
        )?;

// Only proceed if compaction actually happened (messages were removed)
// or there's still a session to work with
if removed > 0 || result.compacted_session.messages.len() > 0 {
The condition `removed > 0 || result.compacted_session.messages.len() > 0` is likely too permissive. If `removed == 0`, the session state remains identical to the one that just failed, meaning the retry will inevitably encounter the same context window error. It is more efficient to only attempt a retry if compaction actually removed messages to free up space.
Suggested change:
- if removed > 0 || result.compacted_session.messages.len() > 0 {
+ if removed > 0 {
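To make the reviewer's point concrete, here is a minimal sketch comparing the two conditions. The function names `permissive_retry` and `strict_retry` are hypothetical and do not appear in the PR; they only isolate the boolean logic under discussion.

```rust
/// The condition as written in the PR: retries whenever any message survives,
/// even if compaction removed nothing.
pub fn permissive_retry(removed: usize, remaining: usize) -> bool {
    removed > 0 || remaining > 0
}

/// The suggested condition: retry only if compaction actually removed messages.
pub fn strict_retry(removed: usize) -> bool {
    removed > 0
}
```

With `removed == 0` and a non-empty session, `permissive_retry` still returns `true`, which would resend the same oversized context; `strict_retry` returns `false` and lets the original error surface.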
The pull brought the branch current with origin/main while replaying local follow-up work. Conflict resolution kept the roadmap/progress additions and integrated the runtime event/trust changes with upstream's newer surfaces. The trust allowlist now treats worktree_pattern as an additional required predicate, including the missing-worktree case, so auto-trust cannot fall back to cwd-only matching when a worktree constraint was declared. The runtime formatting cleanup keeps clippy/fmt green after the merge. Constraint: Local branch was 109 commits behind origin/main with dirty tracked follow-up work. Rejected: Drop the autostash after conflict resolution | keeping it preserves a reversible safety backup for unrelated recovery. Confidence: high Scope-risk: moderate Directive: Do not relax worktree_pattern matching without preserving the missing-worktree regression. Tested: git diff --cached --check; cargo fmt -p runtime -- --check; cargo clippy -p runtime --all-targets -- -D warnings; cargo test -p runtime; cargo test --workspace; architect verification approved Not-tested: Live tmux/worker auto-trust behavior outside unit/integration tests
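The `worktree_pattern` rule described above could be sketched roughly as follows. The `TrustEntry` struct, its field names, and the use of prefix matching are all assumptions made for illustration; the real allowlist types and pattern syntax are not shown in this PR.

```rust
/// Hypothetical allowlist entry; field names are assumptions.
pub struct TrustEntry {
    pub cwd_prefix: String,
    pub worktree_pattern: Option<String>,
}

/// Auto-trust only if the cwd matches AND, when a worktree pattern was
/// declared, a worktree is present and matches it. Prefix matching stands in
/// for whatever pattern syntax the real code uses.
pub fn auto_trust(entry: &TrustEntry, cwd: &str, worktree: Option<&str>) -> bool {
    if !cwd.starts_with(entry.cwd_prefix.as_str()) {
        return false;
    }
    match (&entry.worktree_pattern, worktree) {
        // No worktree constraint declared: cwd match is sufficient.
        (None, _) => true,
        // Constraint declared and a worktree is known: both must match.
        (Some(pattern), Some(wt)) => wt.starts_with(pattern.as_str()),
        // Constraint declared but no worktree: never fall back to cwd-only.
        (Some(_), None) => false,
    }
}
```

The last match arm is the missing-worktree regression case that the Directive asks future changes to preserve.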
Worker boot could previously stall on an interactive MCP/tool permission prompt while readiness and startup-timeout surfaces only had generic idle/no-evidence shapes. This adds a first-class blocked lifecycle state, structured event payload, startup evidence fields, and regression coverage so callers can report the exact server/tool gate instead of pane-scraping. Constraint: ROADMAP ultraworkers#200 requires tool/server identity, prompt age, and session-only versus always-allow capability in status/evidence surfaces Rejected: Treat MCP/tool prompts as trust gates | conflates distinct prompts and loses tool identity Rejected: Leave allow-scope as pane text only | clawhip still could not classify the blocker without scraping Confidence: high Scope-risk: moderate Directive: Keep tool_permission_required distinct from trust_required; downstream claws rely on server/tool payload plus allow-scope metadata Tested: cargo test -p runtime tool_permission Tested: cargo fmt -p runtime -- --check && cargo clippy -p runtime --all-targets -- -D warnings && cargo test -p runtime Tested: cargo test --workspace Not-tested: live interactive MCP permission prompt in tmux
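One way to picture the blocked lifecycle state is an enum that keeps the two prompt kinds separate and carries the tool identity in its payload. This is only a sketch: `BootBlock`, `AllowScope`, and the field names are hypothetical, while the event names `trust_required` and `tool_permission_required` come from the commit message.

```rust
/// Whether the prompt can be answered for this session only or permanently.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum AllowScope {
    SessionOnly,
    AlwaysAllow,
}

/// Why worker boot is blocked. Variant and field names are illustrative.
#[derive(Debug, Clone, PartialEq)]
pub enum BootBlock {
    TrustRequired,
    ToolPermissionRequired {
        server: String,
        tool: String,
        prompt_age_secs: u64,
        allow_scope: AllowScope,
    },
}

impl BootBlock {
    /// Stable event name, kept distinct per prompt type.
    pub fn event_kind(&self) -> &'static str {
        match self {
            BootBlock::TrustRequired => "trust_required",
            BootBlock::ToolPermissionRequired { .. } => "tool_permission_required",
        }
    }

    /// One-line status text naming the exact server/tool gate.
    pub fn describe(&self) -> String {
        match self {
            BootBlock::TrustRequired => "blocked: workspace trust prompt".to_string(),
            BootBlock::ToolPermissionRequired { server, tool, prompt_age_secs, allow_scope } => format!(
                "blocked: {}/{} awaiting permission for {}s ({:?})",
                server, tool, prompt_age_secs, allow_scope
            ),
        }
    }
}
```

Keeping the variants separate means a caller can branch on `event_kind()` without scraping pane text.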
Reject empty --allowedTools inputs instead of treating them as an empty restriction, and surface status JSON metadata that distinguishes default unrestricted tools from flag-provided allow lists. Confidence: high Scope-risk: narrow Tested: cargo test -p rusty-claude-cli rejects_empty_allowed_tools_flag -- --nocapture Tested: cargo test -p tools allowed_tools_rejects_empty_token_lists -- --nocapture Tested: cargo check -p rusty-claude-cli -p tools Tested: cargo test -p rusty-claude-cli -p tools Not-tested: full workspace cargo fmt --check is blocked by pre-existing unrelated formatting drift
Run rustfmt from the Rust workspace so CI format checks pass without changing behavior. Constraint: Scope is formatting-only across tracked Rust files Confidence: high Scope-risk: narrow Tested: cd rust && cargo fmt --check Tested: git diff --check
The Rust crate layout expects formatting to run from the rust directory, so add a root-level wrapper that preserves the working command while forwarding user flags like --check. Documentation now points contributors at the wrapper instead of the misleading virtual-workspace manifest invocation. Constraint: Root-level cargo fmt --manifest-path rust/Cargo.toml is misleading for this virtual workspace Rejected: Document cd rust && cargo fmt directly | a root wrapper gives one stable repo-root command Confidence: high Scope-risk: narrow Tested: scripts/fmt.sh --check Tested: git diff --check
The formatting wrapper should remain safe when invoked through different current directories or shell contexts, so resolve the script directory before entering the Rust workspace and forwarding cargo fmt arguments. Constraint: Wrapper must be runnable from repo root while forwarding flags like --check Rejected: Leave relative dirname cd | less robust if invocation context changes Confidence: high Scope-risk: narrow Tested: scripts/fmt.sh --check Tested: git diff --check
Make scripts/fmt.sh robust to caller cwd and document it as the supported repo-root formatting entrypoint for the Rust workspace.
Operator status previously treated any tmux pane in a workspace as equivalent to active work. The new classifier uses tmux pane command/path metadata as a soft signal, treats plain shells as idle, and adds dirty-worktree abandoned markers to status and session-list output for clawhip consumers. Constraint: Keep issue ultraworkers#320 prototype minimal and additive without new dependencies Rejected: Screen-scraping pane output | fragile and broader than needed for lifecycle classification Confidence: high Scope-risk: narrow Tested: cargo test -p rusty-claude-cli Tested: cargo check -p rusty-claude-cli Not-tested: cargo clippy -p rusty-claude-cli --all-targets -- -D warnings is blocked by pre-existing commands crate clippy::unnecessary_wraps warnings
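A rough sketch of the "plain shells are idle" rule, assuming the classifier receives the pane's foreground command name (for example tmux's `pane_current_command`). The `PaneActivity` enum, the function name, and the shell list are assumptions, not the actual classifier.

```rust
#[derive(Debug, PartialEq)]
pub enum PaneActivity {
    Idle,
    Active,
}

/// Treat panes whose foreground command is a plain shell as idle.
/// A leading '-' (login shell, e.g. "-bash") is ignored.
pub fn classify_pane(current_command: &str) -> PaneActivity {
    // Illustrative list only; the real classifier may recognise other shells.
    const SHELLS: [&str; 5] = ["bash", "zsh", "sh", "fish", "dash"];
    let name = current_command.trim().trim_start_matches('-');
    if name.is_empty() || SHELLS.contains(&name) {
        PaneActivity::Idle
    } else {
        PaneActivity::Active
    }
}
```

Because the command name is only a soft signal, a real implementation would likely combine it with the pane path and worktree state, as the commit message describes.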
Keep claw --help's resume-safe slash command summary aligned with the interactive command list by filtering STUB_COMMANDS and adding regression coverage.
…session-lifecycle-classification Fix session lifecycle classification for idle tmux shells
…48-prompt-mode-silent-hang docs(roadmap): add ultraworkers#248 prompt-mode silent-hang pinpoint
…49-issue-github-oauth-opacity docs(roadmap): add ultraworkers#249 issue GitHub OAuth opacity pinpoint
…rruption and session identity contradiction
…22-323-clean docs(roadmap): add ultraworkers#322 ultraworkers#323 — json stream corruption and session identity contradiction
Constraint: Documentation-only follow-up from current main e7074f4 after PR ultraworkers#2838; edit scope limited to ROADMAP.md.
Rejected: Implementing provenance detection now | user requested roadmap entry only.
Confidence: high
Scope-risk: narrow
Directive: Future implementation should compare embedded build git_sha/build date to workspace HEAD/dirty state without leaking secrets.
Tested: git diff --check; scripts/fmt.sh --check
Not-tested: Runtime provenance behavior; this commit only records the roadmap requirement.
…24-stale-binary-provenance docs(roadmap): add ultraworkers#324 stale binary provenance pinpoint
Document the dogfood gap where help JSON stays parseable but hides command metadata inside a prose message, so future implementation can expose machine-readable command, slash-command, and resume-safety fields.

Constraint: user requested ROADMAP.md-only pinpoint for issue ultraworkers#325 from origin/main d607ff3.
Rejected: implementing the schema now | requested fix shape is roadmap documentation only.
Confidence: high
Scope-risk: narrow
Directive: keep message for humans while adding schema/versioned structured help metadata when implementing.
Tested: git diff --check; scripts/fmt.sh --check
Not-tested: runtime CLI behavior unchanged by docs-only change
…25-help-json-schema docs(roadmap): add ultraworkers#325 help json schema opacity pinpoint
…26-dogfood-pinpoint docs(roadmap): add ultraworkers#326 pane inventory opacity pinpoint
Constraint: Scope limited to ROADMAP.md and one new pinpoint ultraworkers#327 from actual rebuilt claw dogfood. Rejected: Code fix in this branch | user requested roadmap-only filing. Confidence: high Scope-risk: narrow Directive: Keep mcp help source lists derived from actual config discovery, not hard-coded partial docs. Tested: ./rust/target/debug/claw version --output-format json; ./rust/target/debug/claw mcp --help; ./rust/target/debug/claw mcp help --output-format json; temp .claw.json mcp list proof; git diff --check; scripts/fmt.sh --check Not-tested: Full Rust test suite, documentation-only change.
…27-dogfood-pinpoint Add ROADMAP ultraworkers#327 for MCP help source mismatch
Constraint: Scope requested ROADMAP.md only with exactly one new ultraworkers#328 pinpoint from direct claw dogfood.
Rejected: Implementing the agents-help fix now | user requested roadmap-only evidence item.
Confidence: high
Scope-risk: narrow
Directive: Keep agent help source roots derived from the same loader registry as agents list; do not hand-maintain a divergent root list.
Tested: cargo run --manifest-path rust/Cargo.toml --bin claw -- version --output-format json; ./rust/target/debug/claw version --output-format json; ./rust/target/debug/claw agents help --output-format json; ./rust/target/debug/claw agents --output-format json; git diff --check; scripts/fmt.sh --check
Not-tested: Full Rust test suite; roadmap-only documentation change.
…28-dogfood-pinpoint Add ROADMAP ultraworkers#328 for native-agent source provenance
Constraint: Respond to dogfood nudge with exactly one concrete clawability pinpoint from direct claw-code use.
Evidence: rebuilt actual debug binary at git_sha 0f7578c; compared resume-safe /agents --output-format json with top-level claw agents --output-format json.
Finding: slash /agents JSON only exposes kind,text while top-level agents JSON exposes structured agents[] inventory and provenance.
Tested: cargo run --manifest-path rust/Cargo.toml --bin claw -- version --output-format json; ./rust/target/debug/claw --resume latest /agents --output-format json; ./rust/target/debug/claw agents --output-format json; git diff --check; scripts/fmt.sh --check.
Not-tested: full Rust suite; roadmap-only documentation change.
…29-slash-agents-json-opacity docs(roadmap): add ultraworkers#329 for slash agents JSON opacity
Constraint: Respond to 14:30 dogfood nudge with one direct claw-code pinpoint.
Evidence: rebuilt actual debug binary at git_sha 24ccb59; compared top-level help --output-format json with resume-safe /help --output-format json.
Finding: same help surface uses message in top-level JSON and text in slash/resume JSON.
Tested: cargo run --manifest-path rust/Cargo.toml --bin claw -- version --output-format json; ./rust/target/debug/claw help --output-format json; ./rust/target/debug/claw --resume latest /help --output-format json; git diff --check; scripts/fmt.sh --check.
Not-tested: full Rust suite; roadmap-only documentation change.
`claw version --output-format json` was missing build_date and executable_path, making it impossible to identify which binary is running or correlate it with a specific build/commit. Fix: version_json_value() now includes: - build_date: compile-time BUILD_DATE env (already in text output) - executable_path: std::env::current_exe() at runtime Test: version_emits_json_when_requested extended to assert both fields are strings in the JSON envelope. Pinpoint: ROADMAP ultraworkers#507
…ltraworkers#2987) Resumed /agents --output-format json was returning a human-readable text render wrapped in a JSON envelope field instead of the actual structured agent list. The run_resume_command handler was calling handle_agents_slash_command (text) for the json field instead of handle_agents_slash_command_json. Fix: use handle_agents_slash_command_json for the json outcome field, matching the pattern already used by /skills and /plugins. Test: extended resumed_inventory_commands_emit_structured_json_when_requested to cover /agents, asserting kind=="agents", action=="list", agents is an array, and count is a number (not a text render).
…raworkers#2992) resumed_status_command_emits_structured_json_when_requested was reading the real ~/.claw/settings.json, causing loaded_config_files to be 1 instead of the expected 0 on machines with user config present. Root cause: unlike other tests (e.g. resumed_config_command_loads_settings_files), this test did not pass an isolated CLAW_CONFIG_HOME env var to run_claw, so claw fell back to the real HOME and loaded the developer's settings file. Fix: create a temp config-home dir and pass it as CLAW_CONFIG_HOME via run_claw_with_env. This gives the assertion a clean 0-file baseline. Unblocks PRs ultraworkers#2973, ultraworkers#2988, ultraworkers#2990 which all failed this same test on main. Ref: ROADMAP ultraworkers#65
…ltraworkers#2988) `claw skills show <name>`, `claw skills info <name>`, and `claw skills list <filter>` were all falling through to SkillSlashDispatch::Invoke, which spawned a real model session, consumed tokens, and created session files. Root cause: classify_skills_slash_command had no guards for these discovery prefixes; every non-reserved arg became Invoke. Fix: - Add "show", "info" as Local-only bare tokens - Add starts_with guards for "show ", "info ", "list " args - handle_skills_slash_command: filter skill list by name/substring for show/info/list-filter paths (no model call, no session) - handle_skills_slash_command_json: same structured filtering Test: skills_show_and_list_filter_do_not_invoke_model asserts classify_skills_slash_command returns Local for all discovery patterns and still returns Invoke for bare skill names. Pinpoint: ROADMAP ultraworkers#502
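The guard logic described in this commit might look roughly like the sketch below. The function name `classify_skills_args`, the `Invoke` payload, and the exact set of reserved bare tokens (`""`, `list`, `help`) are assumptions; only `show`, `info`, the three prefix guards, and the `Local`/`Invoke` split come from the commit message.

```rust
#[derive(Debug, PartialEq)]
pub enum SkillSlashDispatch {
    /// Handled locally: no model call, no session file.
    Local,
    /// Treated as a skill name and sent to the model.
    Invoke(String),
}

pub fn classify_skills_args(args: &str) -> SkillSlashDispatch {
    let args = args.trim();
    // Bare discovery tokens, plus the prefixed forms that take an argument.
    let is_discovery = matches!(args, "" | "list" | "help" | "show" | "info")
        || args.starts_with("show ")
        || args.starts_with("info ")
        || args.starts_with("list ");
    if is_discovery {
        SkillSlashDispatch::Local
    } else {
        SkillSlashDispatch::Invoke(args.to_string())
    }
}
```

Note the trailing space in each prefix guard: it keeps a skill that happens to be named `showcase` on the `Invoke` path.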
… subcommands (ultraworkers#2990) `claw config model --output-format json` and all other section subcommands (`env`, `hooks`, `plugins`) returned identical output with no section field — the section arg was parsed but discarded (_section parameter). Fix: render_config_json now: - Passes section through to handler - Looks up the section value via runtime_config.get(), converting the internal JsonValue to serde_json::Value via render()+parse - Emits `section` (string) and `section_value` (JSON value or null) in the response envelope - Returns ok:false + error for unsupported section tokens Test: config_section_json_emits_section_and_value asserts: - No section field when no section arg - section + section_value fields present for all known sections - ok:false + error for unknown section Pinpoint: ROADMAP ultraworkers#126
* fix: support /plugins slash command in resume mode
Move SlashCommand::Plugins out of the 'unsupported resumed slash
command' catch-all and add a handler arm in run_resume_command that
calls handle_plugins_slash_command for list/help actions.
Mutation actions (install/uninstall/enable/disable) are rejected with
a clear error since there is no runtime to reload in resume mode.
Add /plugins coverage to resumed_inventory_commands test in
output_format_contract.rs: kind, action, reload_runtime, target.
Before: claw --resume session.jsonl /plugins --output-format json
-> {error: 'unsupported resumed slash command', type: 'error'}, exit 1
After: claw --resume session.jsonl /plugins --output-format json
-> {kind: 'plugin', action: 'list', ...}, exit 0
* style: cargo fmt line wrap in run_resume_command plugins handler
* fix: block /plugins update in resume mode, fix comment
Address REQUEST_CHANGES from OMX review:
1. Add 'update' to the blocked mutation actions in resume mode
(previously only install/uninstall/enable/disable were blocked)
2. Fix comment: 'Only list is supported' instead of 'Only list/help'
since /plugins help doesn't actually parse as a valid action
* style: cargo fmt after conflict resolution
…ibe/list-filter) (ultraworkers#2989) `claw mcp info nonexistent --output-format json` and `claw mcp list nonexistent --output-format json` fell through to the generic help renderer, returning an opaque envelope with only `unexpected` set — no machine-readable error_kind. Fix: - Add typed guards in render_mcp_report_for/_json_for for: - `list <filter>`: list accepts no filter argument - `info <name>` / `describe <name>`: suggest `mcp show` - New render_mcp_unsupported_action_text/json helpers emit `ok:false`, `error_kind:"unsupported_action"`, `hint`, `requested_action` - `mcp show`, `mcp list`, `mcp help` existing paths unchanged Test: mcp_unsupported_actions_return_typed_error_not_generic_help asserts kind=="mcp", ok==false, error_kind=="unsupported_action" for info/list-filter/describe paths. Pinpoint: ROADMAP ultraworkers#504
…ler (ultraworkers#2993) claw plugin list / claw marketplace / claw marketplace list all fell through to the prompt/LLM path because parse_subcommand only matched "plugins" (the primary name) while the canonical spec aliases "plugin" and "marketplace" were unhandled. This manifested as auth errors and session creation on direct invocation — dogfood confirmed Gaebal's binary created one session via plugin prompt fallback. Fix: extend the plugins arm in parse_subcommand to also match "plugin" | "marketplace" so all three forms route to the same CliAction::Plugins without network calls or session creation. Verified: all six forms (bare + list subcommand for each name) return kind:plugin JSON, exit 0, and create zero sessions. Closes ROADMAP ultraworkers#55 partial (plugins/marketplace bypass complete).
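The alias routing can be illustrated with a small match arm. This is a simplified stand-in for `parse_subcommand`: the `route` function and the shape of the `CliAction` variants are assumptions, while the three accepted names come from the commit message.

```rust
#[derive(Debug, PartialEq)]
pub enum CliAction {
    /// Local plugin handling: no network call, no session.
    Plugins { rest: Vec<String> },
    /// Fallback: everything else is treated as a prompt.
    Prompt(String),
}

pub fn route(args: &[&str]) -> CliAction {
    match args.first().copied() {
        // The primary name and both aliases share one arm.
        Some("plugins" | "plugin" | "marketplace") => CliAction::Plugins {
            rest: args[1..].iter().map(|s| s.to_string()).collect(),
        },
        _ => CliAction::Prompt(args.join(" ")),
    }
}
```

Putting the aliases in the same arm as the primary name means a new alias cannot silently fall through to the prompt path unless it is left out of this one pattern.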
…ling through to LLM (ultraworkers#2994) claw permissions list / claw permissions allow <tool> / claw permissions deny <tool> all fell through to the prompt/LLM path because parse_subcommand had no arm for "permissions". The single-word bare form was already intercepted by bare_slash_command_guidance, but any form with rest.len() > 1 bypassed the single-word guard and landed in the _other => CliAction::Prompt branch. Fix: add a "permissions" arm in parse_subcommand that returns a structured guidance Err so all multi-word forms get the same exit:1 + JSON error as the bare single-word form, without any LLM call or session creation. Verified: all invocation forms (bare, list, read-only, workspace-write, allow/deny <tool>) exit 1 with kind:unknown guidance JSON. Zero sessions.
)
* fix(mcp): exit 1 when JSON envelope contains ok:false
mcp info, mcp describe, and mcp list-filter all return {"action":"error","ok":false,...} but previously exited 0, requiring automation callers to inspect the envelope field.
After this fix: print_mcp detects ok:false in the rendered JSON value and calls process::exit(1) after printing, so the exit code reflects the semantic error in the envelope.
Unaffected: mcp list, mcp show, mcp help all have no ok field and continue to exit 0 (they are not error paths).
Closes ROADMAP ultraworkers#68 (partial; agents bogus/mcp show nonexistent found:false remain exit:0 as they use different envelope shapes).
* feat(scripts): add dogfood-build.sh: build from checkout and verify provenance
Builds claw from the current HEAD, then checks that the binary's git_sha matches git rev-parse --short HEAD. Exits non-zero if the binary is stale or provenance is opaque (git_sha: null).
Usage:
CLAW=$(bash scripts/dogfood-build.sh) # fail-fast if stale
$CLAW version --output-format json # provenance confirmed
Addresses ROADMAP ultraworkers#69: dogfooders using a stale installed binary cannot attribute behavior to specific commits. This script makes dogfood round zero unambiguous. Also documents the safe workaround for contributors who have a stale system-installed binary.
…passes (ultraworkers#2996) Scripts-only PR — CI intentionally does not run for scripts/ (path filter covers rust/** and docs only). Manually verified: dogfood-build.sh builds, injects GIT_SHA, verifies provenance, and documents CLAW_CONFIG_HOME isolation. Zero stderr with isolated config.
…-json Fix export help JSON output
…t-hardening openai: harden token-limit handling and default output-token caps
…-heavy-dirs runtime: prune heavy directories during glob searches
…om 12 conflicting PRs Batch-appended ROADMAP entries from PRs ultraworkers#2950, ultraworkers#2951, ultraworkers#2953, ultraworkers#2954, ultraworkers#2955, ultraworkers#2956, ultraworkers#2957, ultraworkers#2959, ultraworkers#2960, ultraworkers#2962, ultraworkers#2963, ultraworkers#2964. All PRs were CI-green but conflicting on ROADMAP.md due to serial appends to the same file.
… thinking blocks
Five interrelated fixes from parallel Hephaestus sessions:
1. fix(repl): display assistant text after spinner (ultraworkers#2981, ultraworkers#2982, ultraworkers#2937)
   - Added final_assistant_text() call after run_turn spinner completes
   - REPL now shows response text like run_prompt_json does
2. fix(compact): handle Thinking content blocks (ultraworkers#2985)
   - Added ContentBlock::Thinking variant throughout compact summarizer
   - Prevents panic when /compact encounters thinking blocks
3. fix(prompt): provider-aware model identity (ultraworkers#2822)
   - New ModelFamilyIdentity enum (Claude vs Generic)
   - Non-Anthropic models no longer say 'I am Claude'
   - model_family_identity_for() detects provider and sets identity
4. fix(openai): preserve DeepSeek reasoning_content (ultraworkers#2821)
   - Stream parser now captures reasoning_content from OpenAI-compat
   - Emits ThinkingDelta/SignatureDelta events for reasoning models
   - Thinking blocks included in conversation history for re-send
5. feat(runtime): Thinking block support across codebase
   - AssistantEvent::Thinking variant in conversation.rs
   - ContentBlock::Thinking in session serialization
   - Thinking-aware compact summarization
   - Tests for thinking block ordering and content
Closes ultraworkers#2981, ultraworkers#2982, ultraworkers#2937, ultraworkers#2985, ultraworkers#2822, ultraworkers#2821
…e-fixes fix: REPL display, /compact panic, identity leak, DeepSeek reasoning, thinking blocks
…ck test arity Cherry-pick from Yeachan-Heo's ultraworkers#2945 with manual conflict resolution: - classify_skills_slash_command now catches -h/--help anywhere in args - Restored pending_thinking parameter in push_output_block test calls Co-authored-by: Yeachan-Heo <bellman@ultraworkers.dev>
When the model API returns a context_window_blocked error (because the request exceeds the model's context window), the CLI now automatically:
1. Compacts the session (removes old messages to free up space)
2. Retries the original request with the compacted session
3. Reports results to the user
This eliminates the need for users to manually run /compact when they hit context limits; the recovery happens automatically.
## Technical Details
- Detection: Looks for 'context_window' or 'Context window' in error message
- Uses runtime::compact_session() to aggressively compact (max_estimated_tokens=0)
- Creates new runtime with compacted session and retries the turn
- Reports compaction results and final status to user
## Testing
Tested successfully with a request that exceeded model's context:
- Auto-compact triggered: 'Messages removed 19, Messages kept 5'
- Successfully retried and completed after compaction
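The detect, compact, retry flow from this commit message can be sketched end to end with stand-in types. Everything here is hypothetical scaffolding: `Session`, `compact`, and `run_with_auto_compact` are not the PR's real types, and the toy `compact` simply keeps the last `keep` messages rather than reproducing `runtime::compact_session()`. Only the substring detection rule is taken from the commit message.

```rust
/// Hypothetical session: just a list of message strings.
#[derive(Debug, Clone, PartialEq)]
pub struct Session {
    pub messages: Vec<String>,
}

/// Detection rule from the commit message: substring match on the error text.
pub fn is_context_window_error(message: &str) -> bool {
    message.contains("context_window") || message.contains("Context window")
}

/// Toy compaction: keep only the last `keep` messages and report how many
/// were removed.
pub fn compact(session: &Session, keep: usize) -> (Session, usize) {
    let start = session.messages.len().saturating_sub(keep);
    let compacted = Session { messages: session.messages[start..].to_vec() };
    (compacted, start)
}

/// Run a turn; on a context-window error, compact once and retry with the
/// compacted session. Returns the session that should become current.
pub fn run_with_auto_compact<F>(
    session: Session,
    keep: usize,
    mut run_turn: F,
) -> Result<(Session, String), String>
where
    F: FnMut(&Session) -> Result<String, String>,
{
    match run_turn(&session) {
        Ok(reply) => Ok((session, reply)),
        Err(err) if is_context_window_error(&err) => {
            let (compacted, removed) = compact(&session, keep);
            if removed == 0 {
                // Nothing was freed, so a retry would fail the same way.
                return Err(err);
            }
            // The retry must use the compacted session, not the original.
            let reply = run_turn(&compacted)?;
            Ok((compacted, reply))
        }
        Err(err) => Err(err),
    }
}
```

The sketch retries at most once and hands back the compacted session on success, which corresponds to replacing the old runtime only after the retry succeeds.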
Some OpenAI-compatible providers (e.g., GLM-5) omit the `id` field in streaming and non-streaming responses. Adding #[serde(default)] allows the parser to accept these responses instead of failing with "missing field `id`". Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds scripts/install.sh that builds the release binary and links it to ~/.local/bin/claw. Run after code changes to update the CLI. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
When a provider returns HTML (e.g., error page, wrong endpoint) instead of JSON in an SSE stream, provide a clear error message instead of hanging or failing with a cryptic parse error. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
When a provider returns a JSON error (e.g., {"error":{"message":"..."}})
without SSE framing (no "data:" prefix), the SSE parser was silently
ignoring it and hanging. Now detects and surfaces these errors.
Also handles HTML responses that lack SSE framing.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
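A minimal sketch of how a stream reader might separate framed SSE data from unframed error bodies, covering both the bare-JSON and HTML cases described above. The `StreamLine` enum, the function name, and the specific heuristics (checking for an `"error"` key, `<html`, or `<!doctype html`) are assumptions for illustration, not the parser's actual rules.

```rust
#[derive(Debug, PartialEq)]
pub enum StreamLine {
    /// A normally framed SSE data line; holds the payload after "data:".
    Data(String),
    /// A JSON error body sent without SSE framing.
    JsonError(String),
    /// An HTML page (error page, wrong endpoint) instead of a stream.
    Html,
    /// Comments, blank lines, and anything else.
    Ignored,
}

pub fn classify_stream_line(line: &str) -> StreamLine {
    let trimmed = line.trim();
    if let Some(payload) = trimmed.strip_prefix("data:") {
        return StreamLine::Data(payload.trim_start().to_string());
    }
    let lower = trimmed.to_ascii_lowercase();
    if lower.starts_with("<!doctype html") || lower.starts_with("<html") {
        return StreamLine::Html;
    }
    // Heuristic only: an unframed JSON object that mentions an "error" key.
    if trimmed.starts_with('{') && trimmed.contains("\"error\"") {
        return StreamLine::JsonError(trimmed.to_string());
    }
    StreamLine::Ignored
}
```

Surfacing `JsonError` and `Html` as distinct variants lets the caller report a clear message instead of waiting for `data:` lines that never arrive.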
Some providers (GLM, DeepSeek) emit reasoning tokens in `reasoning_content` or nested `thinking.content` fields instead of `content`. Added support for these fields so reasoning models work correctly. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
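The field fallback could be pictured like this, with plain structs standing in for the deserialized chunk. The `Delta` and `Thinking` types and the precedence order (`reasoning_content` first, then `thinking.content`) are assumptions; the commit message only names the fields.

```rust
/// Hypothetical nested `thinking` object.
pub struct Thinking {
    pub content: Option<String>,
}

/// Hypothetical streamed delta with the provider-specific reasoning fields.
pub struct Delta {
    pub content: Option<String>,
    pub reasoning_content: Option<String>,
    pub thinking: Option<Thinking>,
}

/// Return reasoning text from whichever field the provider populated.
pub fn reasoning_text(delta: &Delta) -> Option<&str> {
    delta
        .reasoning_content
        .as_deref()
        .or_else(|| delta.thinking.as_ref().and_then(|t| t.content.as_deref()))
}
```

Regular `content` is deliberately left out of the fallback so that reasoning tokens and answer tokens stay separate.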
The final streaming chunk from some providers contains only finish_reason and usage, with no delta field. Made it optional to prevent parse errors. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
When preserve_recent_messages == 0, raw_keep_from equals messages.len(), causing index out of bounds when accessing session.messages[k]. Added k >= session.messages.len() check to prevent panic. Reason: Compaction with preserve_recent_messages=0 triggered OOB access when checking for tool-use/tool-result pair preservation at boundary. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
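The boundary condition is easy to reproduce in miniature. In the sketch below, `Message` and `keep_from` are hypothetical; only the `k >= messages.len()` guard mirrors the fix described in the commit message, and the pair-preservation rule is a simplified guess at what the boundary check does.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Message {
    Text,
    ToolUse,
    ToolResult,
}

/// Index of the first message to keep. If the boundary would split a
/// tool-use/tool-result pair, step back one so the pair stays together.
pub fn keep_from(messages: &[Message], preserve_recent: usize) -> usize {
    let k = messages.len().saturating_sub(preserve_recent);
    // With preserve_recent == 0, k == messages.len(): indexing messages[k]
    // would panic, so bail out before touching the boundary.
    if k == 0 || k >= messages.len() {
        return k;
    }
    if messages[k] == Message::ToolResult && messages[k - 1] == Message::ToolUse {
        k - 1
    } else {
        k
    }
}
```

With `preserve_recent == 0` the function returns `messages.len()` (keep nothing) instead of indexing one past the end.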
Force-pushed from 6a37558 to a15c602
Problem
When a conversation grows large enough to exceed the model's context window, the API returns a `context_window_blocked` error. Previously, this would fail the request and require the user to manually compact the session (or start over), interrupting the workflow.
Solution
This PR implements automatic session compaction with transparent retry:
`context_window_blocked` errors from the API
Flow
Key Behaviors
Testing
Files Changed
`rust/crates/runtime/src/conversation.rs`: auto-compact retry logic in the request path
`rust/crates/api/src/error.rs`: `context_window_blocked` error detection
Impact